Range Predecessor and Lempel-Ziv Parsing
Abstract
The Lempel-Ziv parsing of a string (LZ77 for short) is one of the most important and widely used algorithmic tools in data compression and string processing. We show that the Lempel-Ziv parsing of a string of length n on an alphabet of size σ can be computed in O(n log log σ) time (O(n) time if we allow randomization) using O(n log σ) bits of working space; that is, using space proportional to that of the input string in bits. The previous fastest algorithm using O(n log σ) space takes O(n(log σ + log log n)) time. We also consider the important rightmost variant of the problem, where the goal is to associate with each phrase of the parsing its most recent occurrence in the input string. We solve this problem in O(n(1 + log σ/√log n)) time, using the same working space as above. The previous best solution for rightmost parsing uses O(n(1 + log σ/log log n)) time and O(n log n) space. As a bonus, in our solution for rightmost parsing we provide a faster construction method for efficient 2D orthogonal range reporting, which is of independent interest.
This research is supported by Academy of Finland through grants 258308 and 284598.
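To make the problem statement in the abstract concrete, here is a minimal sketch of LZ77 parsing with rightmost sources. It is a naive quadratic-time illustration, not the algorithm of the paper, and it assumes the common definition in which each phrase is the longest prefix of the remaining text that also starts at an earlier position (overlap allowed), or a single fresh character otherwise. The function name `lz77_rightmost` and the triple format are hypothetical choices made for this example.

```python
def lz77_rightmost(text):
    """Return the LZ77 phrases of text as (start, length, source) triples.

    source is the rightmost earlier starting position of the phrase,
    or None when the phrase is a single character not seen before.
    Naive illustration only: worst-case time is far from the bounds
    stated in the abstract.
    """
    phrases = []
    n = len(text)
    i = 0
    while i < n:
        best_len, best_src = 0, None
        for j in range(i):  # candidate sources start strictly before i
            k = 0
            # The copy may overlap the phrase itself (j + k can pass i).
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            # ">=" keeps the latest j among those reaching the maximum length.
            if k > 0 and k >= best_len:
                best_len, best_src = k, j
        if best_len == 0:
            phrases.append((i, 1, None))  # fresh character
            i += 1
        else:
            phrases.append((i, best_len, best_src))
            i += best_len
    return phrases


if __name__ == "__main__":
    print(lz77_rightmost("abababa"))
    print(lz77_rightmost("aaba"))
```

In `"aaba"` the final `a` could be copied from position 0 or position 1; the rightmost variant reports position 1, the most recent occurrence.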
Similar resources
Lempel-Ziv Dimension for Lempel-Ziv Compression
This paper describes the Lempel-Ziv dimension (a Hausdorff-like dimension inspired by the LZ78 parsing), its fundamental properties, and its relation to Hausdorff dimension. It is shown that in the case of individual infinite sequences, the Lempel-Ziv dimension matches the asymptotic Lempel-Ziv compression ratio. This fact is used to describe results on Lempel-Ziv compression in terms of dime...
On Generalized Digital Search Trees with Applications to a Generalized Lempel-Ziv
The goal of this research is twofold: (i) to analyze generalized digital search trees, and (ii) to derive the average profile (i.e., phrase length) of a generalization of the well-known parsing algorithm due to Lempel and Ziv. In the generalized Lempel-Ziv parsing scheme, one partitions a sequence of symbols from a finite alphabet into phrases such that the new phrase is the longest substring seen...
Asymmetry in Ziv/Lempel '78
We compare the number of phrases created by Ziv/Lempel '78 parsing of a binary sequence and of its reversal. We show that the two parsings can vary by a factor that grows at least as fast as the logarithm of the sequence length. We then show that under a suitable condition, the factor can even become polynomial, and argue that the condition may not be necessary.
Faster Lightweight Lempel-Ziv Parsing
We present an algorithm that computes the Lempel-Ziv decomposition in O(n(log σ + log log n)) time and n log σ + εn bits of space, where ε is a constant rational parameter, n is the length of the input string, and σ is the alphabet size. The n log σ bits in the space bound are for the input string itself, which is treated as read-only.
Universal coding of nonstationary sources
In this correspondence we investigate the performance of the Lempel–Ziv incremental parsing scheme on nonstationary sources. We show that it achieves the best rate achievable by a finite-state block coder for the nonstationary source. We also show a similar result for a lossy coding scheme given by Yang and Kieffer which uses a Lempel–Ziv scheme to perform lossy coding.
Publication date: 2016